The use of computers to build diagnostic inferences 
is presented in two contexts: (1) closed world, exemplified by the 
space shuttle launch monitoring system; and (2) open world, 
represented by computerized diagnostic testing of reading 
comprehension. The analysis shows that the closed world provides a 
substantially cleaner environment within which to perform diagnostic 
inference. In the case of educational diagnosis, most domains tend to 
be relatively open-ended, and thus no comparable clarity can be 
found. If the test materials for computerized administration can be 
designed within tightly controlled parameters, and if the diagnostic 
strategy can be strongly tied to theory about performance errors 
within the topic domain, then many of the ambiguities of diagnostic 
inference will be closer to resolution. The computer has proved 
itself valuable in managing more traditional varieties of educational 
test administration and scoring. Properly programmed, the computer 
can become an unparalleled asset in the context of diagnostic 
testing. (LMO) 

APPLIED STUDIES IN COMPUTERIZED DIAGNOSTIC TESTING: 
IMPLICATIONS FOR PRACTICE 

ABSTRACT 

The use of computers to build diagnostic inferences is explored in two 
contexts. In computerized monitoring of liquid oxygen systems for the 
space shuttle, diagnoses are exact because they can be derived within a 
world which is closed. In computerized classroom testing of reading 
comprehension, programs deliver a constrained form of adaptive testing and 
error performance summary. However, the world is open: diagnostic 
inferences cannot be made with precision, and additional practical factors 
play an important role in delimiting the usefulness of such a system. 
Problems of uncertainty, negation, and nondeterministic prediction are also 
discussed. 

Introduction 

Because of modern computer hardware and software, an intelligent 
system for diagnostic testing which incorporates the advantages of 
computerized management with the latest theoretical developments in 
diagnostic test strategy is no longer locked in the world of science 
fiction. In theory, a small computer could manage an individualized 
adaptive testing session, drawing on a bank of diagnostically relevant test 
items, making real-time decisions about competing diagnostic hypotheses 
based on the incoming stream of responses. In theory, even if premised on 
a rough set of diagnostic indicators, such a system ought to generate a 
functional summary of the performance of an examinee. Because the task of 
diagnosis in its most elementary form is simply one of identifying 
consistent patterns of examinee behavior, it seems an ideal task for the 
computer. 

In reality, of course, neither does the naive view of the diagnostic 
process portrayed above hold true for a moment, nor does blind application 
of high-technology computer programming circumvent an array of decisions 
about the nature of performance and its context, the structure of 
performance testing, and a virtual guarantee of multiple uncertainties in 
interpretation. Important problems arise in programming a 
pattern-diagnostic inferencer to diagnose performance as it occurs, in 
operating that program, and in deriving meaningful diagnostics from its 
outcomes. Finally, even in the best of circumstances, improvements in the 
computability of diagnostic testing hinge on developments in computer 
software and diagnostic theory which have yet to occur. 

Lest the reader feel that this viewpoint is unduly pessimistic, we 
note that computerized diagnostic testing is functioning at this moment in 
fields as diverse as reading comprehension and the launching of space 
vehicles. The salient features of these systems, and their extensions to 
educational diagnostic testing using computers, are the subject of this 
paper. Because of its success, the space shuttle diagnostic system can be 
used to illustrate critically important conceptual underpinnings of the 
diagnostic process, which are generally lacking from diagnostic strategies 
in education and psychology. 

The earliest attempt at what would now be called computerized 
diagnostics was generated by a "teaching-learning machine" designed by 
Pressey (1926). A ratchet-driven device, not unlike a manual typewriter, 
presented selected test items in a viewing window; responses were made on a 
specialized keyboard and scored mechanically. The process was envisioned 
as labor-saving, to "leave the teacher more free for her most important 
work, for developing in her pupils fine enthusiasms, clear thinking, and 
high ideals" (p. 376). More recent work in computerized diagnostic testing 
in educational settings has been discussed in Bejar (1984), McArthur and 
Cabello (1985), McArthur and Choppin (1984), Mitchell (1982), and Schwartz 
(198'). In reference to computerized psychological testing, Roid (1985) 
presents an extended overall analysis, though sketchy on the issue of 
diagnosis. To summarize, all these writers agree that the potential of 
computers applied to the particular tasks of administering, scoring, and 
supplying the bases for test interpretation looks genuinely good. Indeed, 
a large amount of computer code to accomplish computer-managed testing is 
included in Schwartz's book, and several commercial test publishers have 
begun marketing aggressively in this area. 

One reason for optimism is the power of the latest generation of small 
computers. Computer hardware was formerly a major bottleneck in 
implementing computerized diagnosis. Not long ago, few machines were 
capable of handling the job without severe restrictions on speed, memory, 
storage, and ancillary capabilities. In less than a decade, computer 
technology has leapt forward in ways which now allow extraordinarily 
complex logical and mathematical operations to be implemented very 
rapidly. Most restrictions that used to apply are gone, since reliable 
hardware can now include not only keyboard and video display, but also 
voice synthesizer, voice recognition device, and real-time graphics. 
Highly veridical problem simulations are now possible. Alternatively, if 
the testing is only a matter of presenting text to an examinee and waiting 
for a keystroke response, then modern lap-top computers suffice nicely. In 
sum, hardware no longer poses a significant barrier to the development of 
diagnostic tools. 

The task of computerized diagnosis is much more demanding on computer 
software. Both logical and mathematical operations must work in an 
environment of real-time (respond now to this test item) and periodic (save 
the examinee's response pattern in long-term memory) operations. 
Fortunately, a number of programming languages are equipped to handle these 
composite requirements. An important software problem which is more 
difficult to solve is the handling of exception conditions. Exceptions 
occur when the program encounters some action or data which it is not 
prepared to handle. While some languages return a nil, a default, or an 
explicit "don't know," others respond to that event with total program 
failure. When exception handling is added to the requirements noted above, 
no single programming language emerges as the perfect software vehicle for 
computerized testing. Ideally, a real-time oriented language for 
programming of computerized diagnosis would include extensive facilities 
for error-trapping as well as both symbolic manipulation and arithmetic 
computation. Even advanced languages like Modula-2, C, and LISP, and CAI 
production systems like PILOT, incorporate only some solutions to these 
issues, so the final choice awaits further developments in software 
technology. 
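
The difference between an explicit "don't know" and total program failure 
can be illustrated with a minimal sketch. It is written in Python, which is 
not one of the languages named above, and the function name 
score_keystroke and the answer-key layout are assumptions made purely for 
illustration. 

```python
from typing import Optional

# Hypothetical answer key: item number -> correct option letter.
ANSWER_KEY = {1: "b", 2: "d", 3: "a"}


def score_keystroke(item: int, keystroke: object) -> Optional[bool]:
    """Score one response; return None ("don't know") instead of crashing."""
    try:
        choice = str(keystroke).strip().lower()
        if choice not in {"a", "b", "c", "d"}:
            return None  # unusable keystroke: explicit "don't know"
        return choice == ANSWER_KEY[item]
    except KeyError:
        # Item missing from the key: trap the exception, keep the session alive.
        return None
```

A caller can treat None as a signal to re-present the item or to log the 
event, so that neither a stray key nor a missing key entry ends the testing 
session. 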

A Closed-World Diagnostic Inferencer 
The space shuttle launch monitoring system described by Scarl, 
Jamieson and Delaune (1985) serves as an excellent model for computerized 
diagnostics, on the one hand because it is highly effective and on the 
other because the world in which it works is well formulated. Space 
shuttles are launched under extraordinarily tight controls, with thousands 
of critical indicators being monitored and evaluated continuously by 
computer. Recently, the monitoring of liquid oxygen activity (valves, 
pipes, tanks, flow rates, pressures and the like) has been accomplished by 
a computerized expert diagnostic system, operating as an intelligent 
watchdog, with the capability of quickly isolating and interpreting any 
error anywhere within its purview. Its diagnostic strategy is strongly 
predicated on the notion that a truly simultaneous occurrence of independent 
errors is highly unlikely. Far more probable is a failure in some 
component which has consequences felt immediately or soon thereafter in 
several places further along the chain. 

What happens when the liquid oxygen expert "discovers" that one or 
several sensors are reporting values out of normal range is what makes it 
an excellent expert to study from the point of view of diagnosis. In the 
simplest case, the system receives indication of a single-point 
error: a single sensor registers abnormally high or low. The expert 
"knows" enough to assess the degree of criticality of that component, and 
to "understand" whether failure at that point in the complex array of 
liquid oxygen circuitry should have consequences felt downstream by other 
sensors. If all downstream indicators are reporting clear, the most likely 
explanation of this single erroneous indication is sensor failure. 
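
The single-point reasoning just described can be sketched as a small rule. 
The sensor names, the DOWNSTREAM table, and the three verdict strings are 
hypothetical; this is not the shuttle system's code, only an illustration 
of the inference that a lone abnormal reading with clear downstream 
indicators points to the sensor itself. 

```python
# Hypothetical layout: each sensor maps to the sensors downstream of it.
DOWNSTREAM = {
    "tank_pressure": ["valve_flow", "pipe_pressure"],
    "valve_flow": ["pipe_pressure"],
    "pipe_pressure": [],
}


def diagnose_single_point(abnormal: set, sensor: str) -> str:
    """Interpret one out-of-range sensor given the set of all abnormal sensors."""
    if sensor not in abnormal:
        return "normal"
    affected = [s for s in DOWNSTREAM[sensor] if s in abnormal]
    if DOWNSTREAM[sensor] and not affected:
        # Consequences were expected downstream, yet every indicator is clear.
        return "probable sensor failure"
    if affected:
        return "probable component failure"
    return "undetermined"  # nothing downstream to corroborate either way
```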

If, on the other hand, a cluster of errors is suddenly reported 
together, the expert evaluates the root cause of such multi-point failure 
in two ways. The first is a method of set intersections, using assumptions 
about the state in which matters would have to be in order to produce 
those values being received at this time. The second is a method based on 
simultaneous hypothesis testing, using the logic of a propagating error 
tree which is tested in increasing depth until a point source of the error 
is isolated. 
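
A minimal sketch of the first method, set intersection, follows. It assumes 
a single underlying fault and a hypothetical table FAULT_EFFECTS listing 
which sensors each enumerated fault would disturb; the fault and sensor 
names are invented for illustration and do not come from the system 
described by Scarl et al. 

```python
# Hypothetical fault model: each candidate fault -> sensors it would disturb.
FAULT_EFFECTS = {
    "stuck_fill_valve": {"valve_flow", "pipe_pressure"},
    "tank_leak": {"tank_pressure", "valve_flow", "pipe_pressure"},
    "pump_degraded": {"pipe_pressure"},
}


def candidate_faults(abnormal: set) -> set:
    """Intersect, over abnormal sensors, the faults able to explain each one."""
    candidates = set(FAULT_EFFECTS)
    for sensor in abnormal:
        explains = {f for f, effects in FAULT_EFFECTS.items() if sensor in effects}
        candidates &= explains
    # Discard faults that would also disturb a sensor that is reporting clear.
    return {f for f in candidates if FAULT_EFFECTS[f] <= abnormal}
```

With abnormal readings on valve_flow and pipe_pressure only, the 
intersection leaves two candidates, and the final filter removes tank_leak 
because tank_pressure is reporting clear. The sketch works only because the 
fault list is closed: every possible cause appears in the table. 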

The liquid oxygen diagnostic system operates in a closed world. Its 
sensors cover the entire domain, and errors within that domain are 
registered unambiguously. As long as the system's programmers have 
properly placed each sensor and have accounted for any unique operating 
characteristics or "quirks", no error of any consequence whatsoever will go 
undetected. For such a system, the world beyond its sensors need not be 
considered because every plausible diagnostic possibility has already been 
included. 

Assumptions for the Closed World 
The requirement for a closed-world diagnostic system is a test domain 
in which all possible faults may be enumerated discretely, and in which 
each single source of data may be pegged, in advance, as to its range of 
reporting values. The introduction of a single fault not contained in the 
list of known faults, or of a single datum of unknown character, exceeds 
the closed world and destroys its advantage. A key part of the advantage 
of a completely closed world is that all of the operating characteristics 
of that world can be known exactly. They need not be explicated in their 
entirety; rather, by virtue of their availability, the closed-world 
inferencer has the resources to evaluate any plausible permutation of 
events. 

Suppose we are interested in diagnosing faults in a contained domain 
like the liquid oxygen system with the constraint that we do not yet know 
exact tolerances for many of the sensors. We could proceed by discarding 
the evidence shown by those sensors altogether and instead use only those 
pieces of evidence whose shape we know in advance. 
We could allow the shuttle to be launched under a series of controlled 
trials regardless of the data until we amass a repertoire of interrelated 
cause-and-effect relationships between sensor reports and final outcome, 
using that experience to build a library of allowable values. We could 
attempt to corroborate the multiple data from sensors from the present 
shuttle with salient aspects of previous experience, then use such 
experience as a selective and conditional guide to completing the present 
task. Multiple avenues could be productively explored given enough 
resources, such that a suitable diagnostic evaluation eventually could be 
made of liquid oxygen system activity. Obviously, however, the operational 
advantage lies with the system which need not engage in strictly 
exploratory behaviors before being able to form diagnostic conclusions. 

Two additional considerations must also be made: the first concerns 
the nonintermittency of signals while the second concerns the granularity 
of the data received. Nonintermittency is a strong assumption within the 
liquid oxygen diagnostic system. While each sensor is capable of 
generating a continuous stream of data, sensing of the status of any given 
sensor occurs at discrete intervals. Any sensed value is expected to be 
regular. That is, stable readings are seen as far more likely, from the 
point of view of diagnostic interpretation, than are wild fluctuations 
within short intervals. Indeed, intermittent fluctuations are more readily 
interpretable as sensor errors and noise than as diagnostically relevant 
indicators. 
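
One way to picture the nonintermittency assumption is as a rule over 
discretely sampled readings. In the sketch below, the allowable range 
(low, high) and the threshold min_run are hypothetical parameters chosen 
for illustration; the source does not state how the liquid oxygen system 
separates noise from relevant indications. 

```python
def classify_readings(samples, low, high, min_run=3):
    """Label a series of discrete samples under a nonintermittency assumption."""
    longest = run = 0
    for value in samples:
        # Count consecutive out-of-range samples; reset on any in-range sample.
        run = run + 1 if not (low <= value <= high) else 0
        longest = max(longest, run)
    if longest == 0:
        return "in range"
    # A sustained excursion is treated as diagnostically relevant; a brief
    # one is treated as sensor error or noise.
    return "sustained excursion" if longest >= min_run else "intermittent noise"
```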

The second additional consideration is that the data received from the 
various sensors are at the functional granularity demanded by the 
diagnostic inference process. This means, on one hand, that no aggregation 
of incoming values need be made prior to using them diagnostically, and, on 
the other, that no step in the diagnostic process will require finer shades 
of data than are being delivered. The granularity of data, in this 
instance, is optimal. 

Diagnostic Inference in an Open World 
While there are several approaches which have been taken to building 
computerized diagnostic testing in the domain of reading comprehension, 
few of the assumptions used in closed-world diagnostics carry over. 
Reading comprehension is a domain for which few theorists, if any, have 
attempted to formalize all of the likely diagnostic indicators of erroneous 
performance. The diagnostic inference process in reading comprehension, 
even as practiced by professionals, represents more than simple rules of 
procedure and logical chaining of consequences. Characteristically the 
process reflects an accumulation of overlapping evidence, plus both common 
sense and, for lack of a better descriptor, professional acumen. Neither 
of the latter factors is especially amenable to computerization. 
Nonetheless, computer programs now exist which are capable of adaptively 
presenting a limited scope of reading test items and deriving from the 
response pattern a composite error summary. 

In a domain such as reading comprehension, the scope of a student's 
misunderstanding can be quite large, so large as to make it exceedingly 
difficult to predict all possible errors. The likelihood of a single-point 
error is slim, since so few errors in reading are unitary. Most often an 
examinee will demonstrate multiple errors, yet isolation of a single cause 
of a multi-point error cannot use a system of tracing error propagation 
because no theory of reading yet includes one. It is rather unlikely that 
the data from such testing can be construed as always nonintermittent, and 
even more unlikely that the raw responses are at an optimal level of 
granularity. For computerized diagnostic testing of reading comprehension 
the benefits of closed-world assumptions do not hold. The relatively 
simple rules which allow noisy data to be cast out cannot be applied. Even 
in the most carefully prepared of the present systems for appraising 
reading comprehension skills, what constitutes a diagnostically useful 
pattern of erroneous responses is not completely resolved. 

Our experience with a dedicated test administration/feedback system 
and a test of reading comprehension skills specifically designed around 
diagnostic principles demonstrates some positive outcomes despite the 
concerns portrayed above. One hundred and sixteen upper primary pupils 
were given a pair of brief computer-managed tests, one a test of pronoun 
usage and the other a reading test involving short essays. Items were 
calibrated by difficulty and item distractors were keyed as to the type of 
error each reflected. Movement from items of moderate difficulty to items 
of greater or lesser difficulty was controlled by real-time appraisal of 
examinee performance. This movement up or down was significantly related 
to examinee grade level and reading skill. An examinee's movement up or 
down in one test was generally corroborated by the same movement up or down 
in the other test. The pronoun test, which represents one of the more 
closed worlds in reading skills, showed a fair degree of performance 
consistency within examinees, and a balanced and logical distribution of 
error types by skill level across examinees. The comprehension test, 
representing a more open-world domain, showed a somewhat less logical error 
pattern overall: students who evidently had the capacity to properly 
answer items at a middle range of item difficulty frequently stumbled on 
simple errors of literal comprehension when answering more difficult 
items. 
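
The general mechanism (difficulty-calibrated items, distractors keyed to 
error types, and movement up or down driven by performance) can be sketched 
as follows. The item bank, the error-type labels, and the simple 
one-step-up, one-step-down rule are assumptions made for illustration; they 
are not the rules or materials used in the study. 

```python
from collections import Counter

# Hypothetical bank: difficulty level -> (correct option, distractor -> error type).
ITEM_BANK = {
    1: ("a", {"b": "literal", "c": "inference"}),
    2: ("c", {"a": "literal", "b": "referent"}),
    3: ("b", {"a": "inference", "c": "referent"}),
}


def run_session(responses, start_level=2):
    """Step up after a correct answer, down after an error; tally error types."""
    level, path, errors = start_level, [], Counter()
    for choice in responses:
        correct, distractors = ITEM_BANK[level]
        path.append(level)
        if choice == correct:
            level = min(level + 1, max(ITEM_BANK))
        else:
            # Unkeyed responses are counted separately rather than dropped.
            errors[distractors.get(choice, "unkeyed")] += 1
            level = max(level - 1, min(ITEM_BANK))
    return path, dict(errors)
```

The returned path records the examinee's movement through difficulty 
levels, and the tally is a crude form of the composite error summary. 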

Practical Requirements 
Our testing with a prototype diagnostic inferencer in classroom 
settings suggests that a number of the troubles noted in reference to 
diagnostic strategy in the open world can be favorably resolved. Heavy 
emphasis must be given to designing a test which adequately covers the full 
scope of a given topic, and does so with items which possess good 
psychometric qualities and well-formed error categories. Based on 
technical considerations, it should be pointed out that there are other 
requirements that dictate the nature of a test which would be suitable for 
computer-managed diagnostic testing in education or psychology. 

The first requirement is that the test match the operations of the 
computer: its text must fit within an available screen window, its tasks 
(type your name, hit this response key) must be unambiguous to the user, 
its options (strike this key to go forward or that key to go backward) must 
be exceedingly clear. The default instructions for taking a 
paper-and-pencil test, so thoroughly ingrained in most students by habit 
alone, are not automatically transferred by students to computer testing. 
For students who falter in trying to type their name at the keyboard 
before the test begins, frustration already mounts. 

Second, the test as a whole must be user-safe. That is, the software 
must be "fire-walled". At the opening requirement that the student type 
their name at the keyboard, there are dozens of possible variations which 
must be distinguished by the computer from an incomplete or erroneous 
attempt; the software must only move forward when the student is ready. No 
matter what logical or illogical key sequence is pressed, the software must 
be able to separate a legitimate response from careless keystrokes or 
slightly malicious attempts to "fool the system." Few students will try the 
latter, but nonetheless a computerized test which succumbs to one pupil's 
errant behavior will stand little chance of surviving the remainder of the 
allotted time period, simply because students find killing the system a 
great deal more interesting than completing the test. 
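
A minimal sketch of such "fire-walling" at the name prompt follows. The 
rule for what counts as a legitimate name (2 to 30 characters, letters with 
optional spaces, hyphens, or apostrophes) and the function name accept_name 
are assumptions for illustration; a real test would set its own rule. 

```python
import re
from typing import Optional

# Assumption: a name is 2-30 characters, letters plus spaces, hyphens, apostrophes.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z' -]{0,28}[A-Za-z]")


def accept_name(raw: str) -> Optional[str]:
    """Return a cleaned name, or None so the caller can simply re-prompt."""
    # Drop control characters left by stray function or escape keys.
    cleaned = "".join(ch for ch in raw if ch.isprintable())
    # Collapse runs of whitespace and trim the ends.
    cleaned = " ".join(cleaned.split())
    if NAME_PATTERN.fullmatch(cleaned):
        return cleaned.title()
    return None
```

Because every rejected entry yields None rather than an error, the software 
moves forward only when a usable name has actually been supplied. 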

Third, the test must be intrinsically interesting, separate from the 
novelty of its appearance on a videoscreen. Because the experience and 
excitement of videogames is almost universal among school children even at 
the lowest grades, expectations of what a computer will do are now 
jaundiced. Repeated presentation of chunks of text on screen, with a 
single keystroke as the sole behavior required of the student, is deadly. 
Some students may find themselves striking a response key simply to make 
the screen do something, anything; in that instance, all of the ordinary 
concerns about random and partially random responding to test items are 
exacerbated. The results of computerized testing and any ensuing 
diagnostic interpretations become uncertain at best. 

Models for Handling Uncertainty 

A diagnostic inferencer which works in anything but a completely 
closed-world environment runs headlong into issues of uncertainty. The 
variety of ways in which uncertainty can be handled statistically and 
probabilistically suggests that no single solution suffices. Pearl (1984) 
describes detailed logical and mathematical approaches to the task which 
stem from a Bayesian tradition. Prade (1985) models 
imprecision and uncertainty with a deductive system modeled on fuzzy-set 
logic. Reggia, Perricone, Nau, and Peng (1985) delineate an abductive 
inference process in which plausible causal associations are derived 
sequentially by testing symbolic conditional probabilities through set 
theory. 

In the context of diagnostic testing, the ways in which uncertainty is 
managed are crucial to the diagnostic outcome. An example of this is 
negation, deciding when a particular diagnostic hypothesis is no longer 
viable. Cohen's (1984) multiple definitions of rule endorsement and 
negation form a case in point. The weakest interpretation of negation 
(called "ostrich") allows a hypothesis to be negated if positive evidence 
in support of the hypothesis does not currently appear in the data. The 
strongest interpretation of negation (called "hard-not") requires that hard 
evidence in support of the negation of a hypothesis or in support of the 
hypothesis' opposite be present in the data. A closed-world assumption 
requires that evidence for negation of the hypothesis be present or that a 
proof offered in support of the hypothesis fails. 
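
The two extremes can be contrasted in a short sketch. Only the labels 
"ostrich" and "hard-not" come from the text; the evidence encoding (+1 for 
support, -1 for contrary evidence, absent when the data are silent) and the 
function name are assumptions, and the sketch is a loose reading of the 
definitions as summarized here rather than Cohen's own formalism. 

```python
def is_negated(hypothesis: str, evidence: dict, mode: str) -> bool:
    """Decide whether a hypothesis is negated under a given interpretation.

    evidence maps a hypothesis name to +1 (support) or -1 (contrary), and
    omits hypotheses about which the data are silent.
    """
    observed = evidence.get(hypothesis)
    if mode == "ostrich":
        # Weakest: negate whenever positive support is currently absent.
        return observed != 1
    if mode == "hard-not":
        # Strongest: negate only on explicit contrary evidence.
        return observed == -1
    raise ValueError(f"unknown negation mode: {mode}")
```

A hypothesis on which the data are silent is negated under "ostrich" but 
survives under "hard-not", which is exactly where the two interpretations 
lead to different diagnostic outcomes. 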

Almost all of the operations undertaken in a diagnostic test in 
education are subject to negation at one point or another. 
The task of diagnostic testing can be interpreted as an exploration of 
competing hypotheses and a weeding out of those hypotheses which are not 
receiving support. Because of the uncertainties inherent in test 
responses, the removal of a plausible hypothesis from the set of hypotheses 
under study is seldom matched by strong evidence. Most frequently, one has 
to make use of weak negative information to place that hypothesis on the 
back burner, then rely on good luck to isolate diagnostically important 
information from the hypotheses which remain. The problem is one of 
logistics as well as mathematics: how can one optimize the selection of 
avenues to explore without predicting that some paths are likely to be less 
fruitful than others? In diagnostic testing in an open-world environment, 
these predictions are very difficult to make. 

In a parallel domain, a recent contribution to the field of artificial 
intelligence sets out a series of transformation methodologies to 
determine sequence-generating rules (Dietterich & Michalski, 1985). The 
problem, a direct analogue of real-time diagnostic testing, is one of 
predicting future behavior by looking back, in varying degrees of depth, at 
the evidence so far; that is, estimating what constitutes a meaningful 
summary pattern and expectation for the next behavior in sequence based on 
some or all of what has gone before. The simplest version of this problem 
occurs when the next behavior (or object, or response) is one which is 
totally predetermined by every attribute associated with all past objects. 
All of the attributes in the preceding string of evidence can be used to 
form a perfect prediction of what will come next; the methodological 
problem reduces to counting the total number of distinct attributes. 

In a nondeterministic prediction problem, the occurrence of the next 
piece of evidence may or may not entirely fit the string of evidence 
collected to date. Certain subsets of attributes may play more significant 
roles in determining the next evidence than others. The goal is to find 
plausible and parsimonious descriptors of key patterns underlying the 
evidence. This is a close equivalent to the kind of diagnostic process 
seen in the non-closed-world systems discussed earlier. The methodological 
problems are significant, for the solution requires understanding which 
attributes of the evidence collected to date actually contribute 
meaningfully to behaviors, and which attributes are misleading, irrelevant, 
and/or simply reflect random values. 
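
The idea of looking back to a chosen depth can be illustrated with a simple 
frequency table. This is a deliberately naive stand-in, not any of the 
methods of Dietterich and Michalski: it records which symbol has followed 
each context of length depth and reports the most frequent follower 
together with the share of cases supporting it. The function name and the 
support ratio are assumptions for illustration. 

```python
from collections import Counter, defaultdict


def predict_next(sequence, depth):
    """Predict the next symbol from the last `depth` symbols, with a support ratio."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    followers = defaultdict(Counter)
    for i in range(depth, len(sequence)):
        context = tuple(sequence[i - depth:i])
        followers[context][sequence[i]] += 1
    seen = followers.get(tuple(sequence[-depth:]))
    if not seen:
        return None, 0.0  # this context has never been followed by anything
    symbol, count = seen.most_common(1)[0]
    return symbol, count / sum(seen.values())
```

In a deterministic sequence the support ratio is 1.0; in a nondeterministic 
one it falls below 1.0, a rough signal that the chosen context does not 
fully determine what comes next. 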

Dietterich and Michalski suggest that the solution to 
non-deterministic prediction might rely on any of three approaches to 
variable-valued logic calculus: disjunctive-normal modeling, decomposition 
modeling, or periodic modeling.* When objects or events in the opening 
sequence make use of a small set of finite-valued attributes, experimental 
evidence suggests that objects or events later in the sequence can be 
predicted well by any of the three models, looking backward to various 
depths at the preceding evidence. Given a stream of data which defies 
categorization, all of these models will either labor for extended periods 
of time without producing useful results, or "discover" one rule or another 
which fits the data badly. Unfortunately, current implementations of all 
three methods suffer from the weakness that they do not attempt to evaluate 
the best solutions first, they have no way to assess the plausibility of 
their results, and they cannot at present form composite models. 

*A disjunctive-normal model builds upon "the fewest number of conjunctive 
terms that covers all of the positive examples and none of the negative 
examples" (p. 219). An iterative process generates an increasing number of 
maximally-general expressions until no positive example remains which is 
not already covered. The decomposition model iterates trial versions of 
generalizations of attributes among pieces of evidence. Its intermediate 
results are then tested against the negative evidence found in the data, 
and it concludes only when the decomposition succeeds in excluding all 
negative evidence. The periodic model expands on this latter approach, 
testing conjuncts of positive evidence and comparing the degrees of overlap 
between attribute selectors until some functional minimum of overlap is 
reached, and at the same time no negative evidence remains included by the 
hypotheses. 

Summary 

Computerization of the diagnostic testing process in education is a 
challenge of multiple dimensions, including both operational aspects and 
philosophical underpinnings. Indeed, the question has been raised as to 
whether computer software can adequately represent the subtle but important 
common-sense elements which come into focus when the domain of interest is 
not within a closed world (Bobrow & Hayes, 1985). As the preceding 
analysis has shown, the closed world provides a substantially cleaner 
environment within which to perform diagnostic inference. In the case of 
educational diagnosis, most domains tend to be relatively open-ended and 
thus no comparable clarity can be found. 

If the test materials for computerized administration can be designed 
within tightly controlled parameters, and if the diagnostic strategy can be 
strongly tied to theory about performance errors within the topic domain, 
then many of the ambiguities of diagnostic inference will be closer to 
resolution. The algorithm that is used to select the next item in sequence 
is also critical: along with item calibrations, a selection algorithm 
could use diagnostically probative items, items which are particularly 
suited to explore the examinee's misunderstanding of a given concept within 
the text. Ideally, too, the pattern of erroneous performance of an 
individual respondent, instant by instant, could be analyzed in the context 
of similar patterns generated in previous testing sessions. 
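
One possible shape for such a selection algorithm is sketched below. The 
item bank, the difficulty scale, the error-type labels, and the scoring 
rule (probe as many still-open hypotheses as possible, then prefer the item 
whose difficulty is closest to the current ability estimate) are all 
assumptions made for illustration, not a procedure taken from the study. 

```python
# Hypothetical bank: item id -> (difficulty, error types its distractors are keyed to).
BANK = {
    "q1": (0.2, {"literal"}),
    "q2": (0.5, {"literal", "referent"}),
    "q3": (0.5, {"inference"}),
    "q4": (0.9, {"referent", "inference"}),
}


def select_next(ability, open_hypotheses, used):
    """Pick the unused item that probes the most open hypotheses,
    breaking ties by closeness of difficulty to the ability estimate."""
    remaining = [i for i in BANK if i not in used]
    if not remaining:
        return None

    def score(item):
        difficulty, probes = BANK[item]
        return (len(probes & open_hypotheses), -abs(difficulty - ability))

    return max(remaining, key=score)
```

Ordering the two criteria the other way (difficulty first, probative value 
second) would give a more conventional adaptive test; which ordering serves 
diagnosis better is an empirical question. 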

Yet to be solved is the problem of diagnostic precision. Inherent in 
most educational topics are unstudied assumptions about the ways in which 
erroneous performance manifests itself. Psychometric evidence makes clear 
that patterns of errors in multiple-choice test items occur in complex ICC 
distributions which are neither directly interpretable from theory, nor 
completely orthogonal to other traits of the test or the respondent. Thus, 
at present, even with the best of computerized testing, the veracity of 
diagnostic outcomes must be closely scrutinized. 

The computer has proved itself valuable in managing more traditional 
varieties of educational test administration and scoring. Properly 
programmed, the computer can become an unparalleled asset in the context of 
diagnostic testing, if certain limits are observed. Taken collectively, 
the sheer number of limits, both of a philosophical nature and in reference 
to actual testing practice, strongly suggests that the computer's role will 
be supplementary to the educational diagnostic specialist. Breakthroughs, 
however, could occur as soon as computer software moves into its next 
generation of power, and as soon as educational theorists are able to build 
detailed models of misunderstanding. 